Completing Low-Rank Matrices with Corrupted Samples from Few Coefficients in General Basis
Subspace recovery from corrupted and missing data is crucial for various
applications in signal processing and information theory. To complete missing
values and detect column corruptions, existing robust Matrix Completion (MC)
methods mostly concentrate on recovering a low-rank matrix from few corrupted
coefficients w.r.t. standard basis, which, however, does not apply to more
general basis, e.g., Fourier basis. In this paper, we prove that the range
space of a low-rank matrix can be exactly recovered from few
coefficients w.r.t. general basis, even when its rank and the number of corrupted
samples are both high. Our model covers
previous ones as special cases, and robust MC can recover the intrinsic matrix
with a higher rank. Moreover, we suggest a universal choice of the
regularization parameter. Our filtering algorithm, which has theoretical
guarantees, further reduces the computational cost of our model. As an application, we also
find that the solutions to extended robust Low-Rank Representation and to our
extended robust MC are mutually expressible, so both our theory and algorithm
can be applied to the subspace clustering problem with missing values under
certain conditions. Experiments verify our theories.
Comment: To appear in IEEE Transactions on Information Theory
High-quality Image Restoration from Partial Mixed Adaptive-Random Measurements
A novel framework to construct an efficient sensing (measurement) matrix,
called mixed adaptive-random (MAR) matrix, is introduced for directly acquiring
a compressed image representation. The mixed sampling (sensing) procedure
hybridizes adaptive edge measurements extracted from a low-resolution image
with uniform random measurements predefined for the high-resolution image to be
recovered. The mixed sensing matrix seamlessly captures important information
of an image, and meanwhile approximately satisfies the restricted isometry
property. To recover the high-resolution image from MAR measurements, the total
variation algorithm based on the compressive sensing theory is employed for
solving the Lagrangian regularization problem. Both peak signal-to-noise ratio
and structural similarity results demonstrate that the MAR sensing framework shows
much better recovery performance than the completely random sensing one. The
work is particularly helpful for high-performance and low-cost data
acquisition.
Comment: 16 pages, 8 figures
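The abstract above reports recovery quality as peak signal-to-noise ratio. As a reminder of what that metric measures, here is a minimal sketch of the standard PSNR definition, 10·log10(peak² / MSE). The function name `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit pixels) are assumptions made for illustration; the abstract does not state how the authors computed it.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Return the PSNR in dB between two equally sized images.

    Images are nested lists of pixel rows (an illustrative representation).
    `peak` is the maximum possible pixel value; 255 assumes 8-bit data.
    """
    ref_flat = [p for row in reference for p in row]
    res_flat = [p for row in restored for p in row]
    if len(ref_flat) != len(res_flat):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, res_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means the restored image is closer to the reference. Structural similarity, the second metric mentioned, is a separate measure and is not covered by this sketch.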
Face Recognition from Sequential Sparse 3D Data via Deep Registration
Previous works have shown that face recognition with highly accurate 3D data is
more reliable and insensitive to pose and illumination variations. Recently,
low-cost and portable 3D acquisition techniques like ToF (Time of Flight) and
DoE-based structured light systems enable us to access 3D data easily, e.g.,
via a mobile phone. However, such devices only provide sparse (limited speckles
in a structured light system) and noisy 3D data, which cannot support face
recognition directly. In this paper, we aim at achieving high-performance face
recognition for devices equipped with such modules, which is very meaningful in
practice as such devices will be very popular. We propose a framework to
perform face recognition by fusing a sequence of low-quality 3D data. As 3D
data are sparse and noisy and cannot be handled well by conventional methods
like the ICP algorithm, we design a PointNet-like Deep Registration
Network (DRNet) which works with ordered 3D point coordinates while preserving
the ability to mine local structures via convolution. Meanwhile, we develop a
novel loss function, based on the quaternion expression, to optimize our DRNet;
it clearly outperforms other widely used functions. For face recognition,
we design a deep convolutional network, based on the AMSoftmax model, which
takes the fused 3D depth map as input. Experiments show that our DRNet can achieve
a rotation error of 0.95° and a translation error of 0.28 mm for registration. The
face recognition on fused data also achieves a rank-1 accuracy of 99.2% and
97.5% at FAR 0.001 on the Bosphorus dataset, which is comparable with
state-of-the-art recognition performance based on high-quality data.
Comment: To appear in ICB201
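The abstract above reports registration quality as a rotation error in degrees and describes a loss built on the quaternion expression, without giving its form. One common way to measure the angle between a predicted and a reference rotation expressed as unit quaternions is θ = 2·arccos(|⟨q₁, q₂⟩|). The sketch below implements that standard formula only; it is not the paper's loss function, and the helper names are hypothetical.

```python
import math


def normalize(vector):
    """Scale a tuple of floats to unit length."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in vector)


def axis_angle_quaternion(axis, angle_deg):
    """Build a unit quaternion (w, x, y, z) from a rotation axis and an angle."""
    half = math.radians(angle_deg) / 2.0
    x, y, z = normalize(axis)
    s = math.sin(half)
    return (math.cos(half), s * x, s * y, s * z)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation taking q_pred to q_true.

    The absolute value of the dot product is used because q and -q
    encode the same rotation.
    """
    p, t = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(p, t)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))
```

For example, comparing the identity quaternion `(1.0, 0.0, 0.0, 0.0)` with `axis_angle_quaternion((0.0, 0.0, 1.0), 0.95)` gives an error of about 0.95 degrees, the magnitude reported in the abstract.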
Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning
It is widely believed that video captioning is a fundamental but challenging
task in both computer vision and artificial intelligence fields. The prevalent
approach is to map an input video to a variable-length output sentence in a
sequence-to-sequence manner via a Recurrent Neural Network (RNN). Nevertheless,
the training of RNNs still suffers to some degree from the vanishing/exploding
gradient problem, making the optimization difficult. Moreover, the inherently
recurrent dependency in an RNN prevents parallelization within a sequence during
training and therefore limits the computations. In this paper, we present a
novel design, Temporal Deformable Convolutional Encoder-Decoder Networks
(dubbed TDConvED), that fully employs convolutions in both encoder and decoder
networks for video captioning. Technically, we exploit convolutional block
structures that compute intermediate states of a fixed number of inputs and
stack several blocks to capture long-term relationships. The structure in
the encoder is further equipped with temporal deformable convolution to enable
free-form deformation of temporal sampling. Our model also capitalizes on
a temporal attention mechanism for sentence generation. Extensive experiments are
conducted on both MSVD and MSR-VTT video captioning datasets, and superior
results are reported compared to conventional RNN-based encoder-decoder
techniques. More remarkably, TDConvED increases CIDEr-D performance from 58.8%
to 67.2% on MSVD.
Comment: AAAI 201
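The idea of "free-form deformation of temporal sampling" in the abstract above can be illustrated with a toy one-dimensional example: each kernel tap reads the input at its regular position plus a fractional offset, using linear interpolation. This is a minimal sketch under several assumptions: features are scalars rather than vectors, offsets are passed in rather than predicted by the network, and out-of-range positions are clamped to the sequence ends. None of these details come from the abstract, and the function names are hypothetical.

```python
def sample_linear(sequence, position):
    """Linearly interpolate `sequence` at a fractional position.

    Positions outside the sequence are clamped to its ends (an assumption;
    zero padding would be another reasonable choice).
    """
    last = len(sequence) - 1
    position = min(max(position, 0.0), float(last))
    left = int(position)  # floor, since position is non-negative here
    right = min(left + 1, last)
    frac = position - left
    return (1.0 - frac) * sequence[left] + frac * sequence[right]


def temporal_deformable_conv(sequence, weights, offsets):
    """Apply a 1D convolution whose taps are shifted by per-step offsets.

    weights: kernel weights, centred on the current time step.
    offsets: offsets[t][k] is the fractional shift of tap k at time t.
    """
    half = len(weights) // 2
    output = []
    for t in range(len(sequence)):
        total = 0.0
        for k, weight in enumerate(weights):
            # Regular grid position of tap k, plus its offset.
            position = t + (k - half) + offsets[t][k]
            total += weight * sample_linear(sequence, position)
        output.append(total)
    return output
```

With all offsets set to zero this reduces to an ordinary temporal convolution, which makes the role of the offsets easy to see when experimenting.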